Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a fast key hasher instead of password hashers #244

Conversation

davidfischer
Copy link
Contributor

@davidfischer davidfischer commented Sep 1, 2023

  • Uses sha512 instead of Django's password hashers for hashing API keys. As outlined here, the purpose of password hashing is to make brute force attempts very hard by making a password check slow (~100ms+). That is likely unnecessary for keys that are randomly generated like in this project. The sha512 hasher is much faster (<1ms).
  • Transparently upgrades any keys encountered that use the slow password hashers with the sha512 hasher whenever that key is used.

Ref: #173

Notes

  • If the basic code is approved, the docs will need a significant overhaul to explain this change. I figured I'd hold off on that until there's consensus that this is the direction to go. Docs are updated in the PR.
  • While sha512 is likely very very far from being broken on high entropy, randomly generated keys, it may make sense to make the hasher pluggable. That seems like over-engineering for now.

# if it is using an outdated hasher.
if valid and not key_generator.using_preferred_hasher(self.hashed_key):
new_hashed_key = key_generator.hash(key)
type(self).objects.filter(prefix=self.prefix).update(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I acknowledge that this code feels very awkward due to the full hash and algorithm being part of the primary key (id). That doesn't seem avoidable if this is the direction we want to go.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, probably not avoidable. The shape of the ID field is what I'd consider a historical design flaw, tracked in e.g. #128. Let's do along with it for now.

@codecov-commenter
Copy link

codecov-commenter commented Sep 1, 2023

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (13fe987) to head (b8ce2f7).
Report is 6 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #244   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           24        25    +1     
  Lines          593       629   +36     
=========================================
+ Hits           593       629   +36     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@beda42
Copy link

beda42 commented Sep 11, 2023

I quite like this solution. The only thing that it is missing is the option to stay with the old hashing scheme.

I think that the move from PBKDF2 to plain SHA-512 is the right one, so I do not care much about that for my own use. But I can imagine there are users of this library who would like to have more control of that.

Anyway, it would be great if we could have this reviewed and move it forward. I would love to see this released soon to get rid of some of the unnecessary load on our servers.

Is there anything I can do to help?

Copy link
Owner

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this contribution!

The code looks good to me. Appreciate the new test.

I went ahead and tested this locally on the test_project.

I created an API key on master (using old hasher), then moved to this branch and ran the following script. It prints the first execution time (when old hasher is used), then the mean and std dev of all 100 test requests (99 of which should use the new hasher after automatic migration).

import httpx
import statistics

api_key = "..."

def api_key_auth(request):
    request.headers["Authorization"] = f"Api-Key {api_key}"
    return request


def timeit(url, auth):
    times = []

    with httpx.Client() as client:
        for n in range(100):
            response = client.get(url, auth=auth)
            times.append(response.elapsed.total_seconds())
            if n == 0:
                print(times[0])

    m = statistics.mean(times)
    s = statistics.stdev(times)

    print(f"{url=:<20}: {m=:.3f} seconds ({s=:.3f} seconds)")

timeit("http://localhost:8000/api/protected/", auth=api_key_auth)

The output is:

0.17201
url=http://localhost:8000/api/protected/: m=0.010 seconds (s=0.017 seconds)

Note: the mean includes the first iteration so it is over-estimated. A second execution of the script brings the mean down to 5ms instead of 10ms.

So that's a ~30x speed improvement. :-) (I a lower factor in production environments due to latency, this was a test on localhost.)

So this is great. I suppose we could implement a way for users to opt out of this new more efficient hashing strategy, but I'm OK with merging (and releasing?) this in the current state.

Any folks would like to give more eyes on this? @beda42?

@florimondmanca
Copy link
Owner

@davidfischer About the docs update:

If the basic code is approved, the docs will need a significant overhaul to explain this change. I figured I'd hold off on that until there's consensus that this is the direction to go.

It does look like some consensus is reached, what do you reckon? Would you like to include docs changes in this PR?

@beda42
Copy link

beda42 commented Sep 11, 2023

@florimondmanca I already went through the code and I am quite happy with it. I do not think that there needs to be an opt-out possibility as the new version seems simply better 😉

@davidfischer
Copy link
Contributor Author

It does look like some consensus is reached, what do you reckon? Would you like to include docs changes in this PR?

I think it makes sense to update the docs at the same time. I'll try to update the docs in this PR in the next couple days.

@davidfischer
Copy link
Contributor Author

I updated the docs which ended up not being as large a project as I suspected.

Copy link
Owner

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation update is neatly on point. Thank you very much!

@florimondmanca
Copy link
Owner

Alright, let’s roll with this one then…

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants